Abstract. We construct an optimal state merging protocol by adapting
a recently-discovered optimal entanglement distillation protocol [Renes
and Boileau, Phys. Rev. A 78, 032335 (2008)]. The proof of optimality
relies only on directly establishing sufficient "amplitude" and "phase"
correlations between Alice and Bob and not on the usual techniques of
decoupling Alice from the environment. This strengthens the intuition from
quantum error-correction that these two correlations are all that really
matter in two-party quantum information processing.



1 Introduction 

Quantum state merging is an important primitive protocol in the
hierarchy of quantum communication protocols, also known as the quantum
information family tree. Given two parties Alice and Bob and a mixed
bipartite state $\psi^{AB}$, the goal of state merging is simply for Alice
to send her half of the state to Bob. One option, of course, is to compress
$\psi^A$ into as few qubits as possible and send it over a quantum channel.
However, this ignores the information Bob has about the state in the form
of $\psi^B$. Although it might seem that a quantum channel is essential for
state merging to work, Bob's side information can be such that only
classical communication from Alice is required.

Reasoning about the protocol is made somewhat easier by considering
the purification $|\psi\rangle^{ABR}$ of $\psi^{AB}$ to a reference system
$R$, that is, $\psi^{AB} = \mathrm{Tr}_R[\psi^{ABR}]$. The goal of state
merging is then to arrange for Bob to hold the purification of $R$. In some
cases quantum communication will clearly be required, for instance when
$|\psi\rangle^{ABR} = |\Phi\rangle^{AR}|\varphi\rangle^{B}$, where
$|\varphi\rangle$ is arbitrary while
$|\Phi\rangle^{AR} = \frac{1}{\sqrt{d}}\sum_k |kk\rangle^{AR}$ is the
canonical maximally entangled state for a fixed basis $\{|k\rangle\}$ and
$d$ is the minimum dimension of $A$ and $R$. Bob's state is clearly
irrelevant, and Alice must simply send her whole system, as it is
incompressible. On the other hand, when Alice and Bob share
$|\Phi\rangle^{AB}$, no communication is required at all! This is simply
due to the fact that now the state of $R$ is by itself pure, so neither
Alice nor Bob holds its purification.



Horodecki, Oppenheim, and Winter [1, 2] consider the asymptotic
setting of many copies of $\psi^{ABR}$ and show that classical
communication suffices when the quantum conditional entropy
$S(A|B) = S(AB) - S(B)$ is negative, where
$S(A) = -\mathrm{Tr}[\rho^A \log_2 \rho^A]$ is the von Neumann entropy. In
fact, when $S(A|B) < 0$ their state merging protocol produces entangled
pairs at the rate $-S(A|B)$ and uses classical communication at the rate
$I(A{:}R)$, where $I(A{:}R) = S(A) + S(R) - S(AR)$ is the quantum mutual
information. These rates are also shown to be optimal. When $S(A|B) > 0$,
on the other hand, any state merging protocol requires quantum
communication at the rate $S(A|B)$, or equivalently consumes entangled
pairs at this rate. This fact gives an operational meaning to the
conditional entropy in terms of entanglement consumption or production,
which due to its possible negativity is quite unlike its classical
counterpart.
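
As a minimal numerical sketch of these two rates, the following Python snippet evaluates $S(A|B)$ and $I(A{:}R)$ from the eigenvalue spectra of the marginals, assuming a pure global state on $ABR$ so that $S(R) = S(AB)$ and $S(AR) = S(B)$. The function names `entropy` and `merging_rates` are illustrative choices, not notation from the text.

```python
import math

def entropy(spectrum):
    """Von Neumann entropy in bits of a state given by its eigenvalues."""
    return -sum(p * math.log2(p) for p in spectrum if p > 0)

def merging_rates(spec_a, spec_b, spec_ab):
    """Return (S(A|B), I(A:R)) for a pure state on ABR.

    Purity of the global state gives S(R) = S(AB) and S(AR) = S(B).
    """
    s_a, s_b, s_ab = entropy(spec_a), entropy(spec_b), entropy(spec_ab)
    conditional = s_ab - s_b        # S(A|B) = S(AB) - S(B)
    mutual = s_a + s_ab - s_b       # I(A:R) = S(A) + S(R) - S(AR)
    return conditional, mutual

# Alice and Bob share one maximally entangled qubit pair, R is decoupled:
# one ebit is gained and no communication is needed.
print(merging_rates([0.5, 0.5], [0.5, 0.5], [1.0]))

# Alice is maximally entangled with R, Bob holds an unrelated pure state:
# one ebit is consumed and two classical bits are sent (teleportation).
print(merging_rates([0.5, 0.5], [1.0], [0.5, 0.5]))
```

The two calls correspond to the two extreme examples of the introduction: $S(A|B) = -1$ with $I(A{:}R) = 0$, and $S(A|B) = 1$ with $I(A{:}R) = 2$.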

In this paper we construct a state merging protocol operating at the
optimal rates by focusing on the classical information that Bob has about
complementary observables "amplitude" and "phase" on Alice's system
and showing how classical communication is sufficient to transfer the
necessary quantum correlations. This approach is substantially different
from the original proof, which is based on the technique of decoupling
Alice's system from the reference system R [3], and follows our recent work
on entanglement distillation (ED) quite closely [4]. Indeed, state merging
is actually achieved in that protocol as well, but at the cost of too much
classical communication. We rectify this problem here, showing that if
Alice first compresses her system and then runs the ED protocol, a small
modification suffices to make this an optimal state merging protocol.

The remainder of the paper is outlined as follows. We first review the 
known results for the state merging protocol in the next section, and then 
recapitulate the important parts of the proof of the ED protocol appearing 
in [4] in the following section. Section 4 contains the new contribution 
of this paper, showing how to modify the ED protocol to use only the 
minimum necessary classical communication. Finally, we conclude with a 
summary of the results and comment on the connections to the quantum 
noisy channel coding theorem. 

2 State Merging Defined 

As with most protocols in quantum information theory, we are
concerned here with the rate at which Alice and Bob can transform an
asymptotically-large number $n$ of copies of the state
$|\psi\rangle^{ABR}$ into a good approximation of $n$ copies in which Bob
holds system $A$. To keep the accounting simple, we assume that any
necessary quantum communication is performed by teleportation through
pre-shared entangled pairs, so that the protocol uses only classical
communication in any case, and either produces or consumes entanglement
depending on the circumstances. We
then define an $(n, \epsilon)$ state merging protocol for $\psi^{ABR}$ to
be a series of local operations involving only classical communication
(LOCC operations) such that application to
$|\Psi\rangle^{ABR} = (|\psi\rangle^{ABR})^{\otimes n}$ produces an output
$\Gamma^{DBR}$ in which Bob holds the system $D$ such that
$\|\Gamma^{DBR} - \Psi^{ABR}\|_1 \le \epsilon$. If there exists an
$(n, \epsilon_n)$ protocol using $K_n$ bits of classical communication and
consuming $E_n$ ebits of entanglement for every $n$ such that
$\lim_{n\to\infty} \epsilon_n = 0$, then the rates of communication and
entanglement consumption of the protocol are given by

$$R_K = \lim_{n\to\infty} \frac{K_n}{n} \quad\text{and}\quad R_E = \lim_{n\to\infty} \frac{E_n}{n}. \qquad (1)$$

Horodecki, Oppenheim, and Winter showed in [1, 2] that

$$\inf R_K = I(A{:}R) \quad\text{and}\quad \inf R_E = S(A|B), \qquad (2)$$

where a negative $R_E$ indicates the amount of entanglement produced. The
proof of these statements has two parts, the direct part showing the rates 
are achievable, and the converse part showing they cannot be surpassed. 
Here we will give a new proof of the direct part, borrowing our techniques 
from [4] which were used to give a new proof of the hashing inequality [5] 
on the achievable rate of entanglement distillation. In the next section we 
sketch the important parts of that proof. 

3 Entanglement Distillation Revisited 

A maximally entangled pair is one for which Bob can predict the
measurement of either of the two observables, "amplitude"
$Z = \sum_k (-1)^k |k\rangle\langle k|$ and its Fourier conjugate "phase"
$X = \sum_k |k \oplus 1\rangle\langle k|$. Here we are assuming that
Alice's system has dimension 2, but what follows can be easily extended to
higher dimensions. Since this is the desired output of the distillation
procedure, the idea behind the protocol given in Theorem 6 of [4] is to
determine what information Bob already has about these observables from his
system $B$ and then arrange for Alice to send him the rest. This is
classical information, since it refers to the measurement outcomes, and
therefore only classical communication will be required. However, since
Alice needs to send information pertaining to both $X$ and $Z$, one must
ensure that both parts of her message simultaneously exist. This is
achieved by measuring the X- and Z-type stabilizers of a
Calderbank-Shor-Steane (CSS) code [6-8] to generate the message. The amount
of information is governed by the "static" version of the
Holevo-Schumacher-Westmoreland (HSW) theorem [9, 10], which we review in
the appendix.
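
To make the opening claim concrete, here is a small check in plain Python, assuming qubits and the canonical pair $|\Phi\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$. It evaluates the correlators $\langle Z \otimes Z\rangle$ and $\langle X \otimes X\rangle$; both equal one for $|\Phi\rangle$, so either measurement on Alice's side is predictable from Bob's, whereas the product state $|00\rangle$ carries amplitude correlation only. The helper names are illustrative.

```python
import math

Z = [[1, 0], [0, -1]]   # amplitude: Z = sum_k (-1)^k |k><k|
X = [[0, 1], [1, 0]]    # phase:     X = sum_k |k+1 mod 2><k|

def matmul(a, b):
    """Product of two small matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product, rows ordered (i, k) and columns ordered (j, l)."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def expectation(op, vec):
    """<vec|op|vec> for a real state vector."""
    out = [sum(op[i][j] * vec[j] for j in range(len(vec)))
           for i in range(len(vec))]
    return sum(v * o for v, o in zip(vec, out))

s = 1 / math.sqrt(2)
phi = [s, 0.0, 0.0, s]            # (|00> + |11>)/sqrt(2)
product = [1.0, 0.0, 0.0, 0.0]    # |00>

zz_phi, xx_phi = expectation(kron(Z, Z), phi), expectation(kron(X, X), phi)
zz_prod, xx_prod = expectation(kron(Z, Z), product), expectation(kron(X, X), product)

# The two observables anticommute, ZX = -XZ, which is why the X and Z
# parts of Alice's message have to be arranged to exist simultaneously.
anticommute = matmul(Z, X) == [[-e for e in row] for row in matmul(X, Z)]
```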

Greatly simplified, the protocol starts by Alice picking a random CSS
code of a given size for her Hilbert space. She then measures the
stabilizers to obtain the syndromes $\alpha$ (for X) and $\beta$ (for Z)
and communicates them to Bob. The syndromes are such that he can find
measurements $\Lambda_{\alpha,\mathbf{x}}$ and $\Lambda_{\beta,\mathbf{k}}$
on $B$ which enable him to predict (with high probability) the outcome of
measuring either $X^A$ or $Z^A$, respectively. The existence of such
measurements is guaranteed by the (static) HSW theorem, using Bob's
marginal states generated by Alice's measurement as the ensemble and the
code syndrome as the side information. It implies that the CSS code must
have roughly $m_z = nS(Z^A|B)$ Z-type syndromes and $m_x = nS(X^A|CB)$
X-type, where $C$ is an additional quantum register containing a copy of
Alice's system in the Z basis, and
$S(Z^A|B) = S(\bar{\psi}^{AB}) - S(\psi^B)$ for $\bar{\psi}^{AB}$ the
shared state after Alice measures the observable $Z$. Once this process is
complete, Bob can (in principle) predict either $X^A$ or $Z^A$ on each
pair, and therefore can perform a quantum operation on his systems to
create entangled pairs (to good approximation). Since Alice is left with
only the code subspace given by $\alpha$ and $\beta$, whose size is
$n - m_x - m_z$, this is the number of entangled pairs they can create.
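
The counting in the last sentence can be illustrated on a small fixed CSS code. The sketch below uses the seven-qubit Steane code as a stand-in (an assumption made purely for illustration; the protocol itself draws a random CSS code of much larger length) and checks that the X- and Z-type stabilizers commute and that $n - m_x - m_z$ encoded qubits remain.

```python
# Parity-check matrix of the [7,4] Hamming code; the Steane code uses it
# for both its X-type and its Z-type stabilizers.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def commute(x_checks, z_checks):
    """X- and Z-type stabilizers commute iff all supports overlap evenly."""
    return all(sum(a * b for a, b in zip(xr, zr)) % 2 == 0
               for xr in x_checks for zr in z_checks)

def gf2_rank(rows):
    """Rank over GF(2) of a binary matrix given as lists of bits."""
    rows = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot          # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

n = len(H[0])
m_x = gf2_rank(H)        # independent X-type syndromes
m_z = gf2_rank(H)        # independent Z-type syndromes
logical = n - m_x - m_z  # size of the remaining code subspace, in qubits
```

Here `commute(H, H)` holds and `logical` evaluates to one, mirroring the $n - m_x - m_z$ count above.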

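To see how this works in more detail, begin with the individual shared
state $|\psi\rangle^{ABR}$ and write it as
$|\psi\rangle^{ABR} = \sum_k \sqrt{p_k}\,|k\rangle^A|\varphi_k\rangle^{BR}$,
where $|k\rangle$ is the eigenbasis of $\psi^A$ and also defines the
operator $Z$, the $|\varphi_k\rangle$ are a set of arbitrary orthonormal
states, and $p_k$ is a probability distribution. The n-fold version
$|\Psi_0\rangle^{ABR} = (|\psi\rangle^{ABR})^{\otimes n}$ we write like so,
using bold-faced symbols $\mathbf{k}$ to denote strings
$(k_1, k_2, \ldots, k_n)$:

$$|\Psi_0\rangle^{ABR} = \sum_{\mathbf{k}} \sqrt{p_{\mathbf{k}}}\,|\mathbf{k}\rangle^A|\varphi_{\mathbf{k}}\rangle^{BR}. \qquad (3)$$

We'll also need to consider the associated state in which Bob has a copy
of Alice's system in the Z basis:

$$|\Psi_C\rangle^{ACBR} = \sum_{\mathbf{k}} \sqrt{p_{\mathbf{k}}}\,|\mathbf{k}\mathbf{k}\rangle^{AC}|\varphi_{\mathbf{k}}\rangle^{BR} = \frac{1}{\sqrt{2^n}} \sum_{\mathbf{x}} |\tilde{\mathbf{x}}\rangle^A|\vartheta_{\mathbf{x}}\rangle^{CBR}.$$

Here $|\tilde{\mathbf{x}}\rangle$ is an eigenstate of $X$ and the
$|\vartheta_{\mathbf{x}}\rangle$ are again an arbitrary set of orthonormal
states. Observe that
$|\vartheta_{\mathbf{0}}\rangle^{CBR} = |\Psi_0\rangle^{CBR}$.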

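Denote the projections onto the stabilizers of the chosen CSS code by
$\Pi_\alpha$ and $\Pi_\beta$, which commute by the CSS nature of the code.
The result of Alice measuring the stabilizers and sending them to Bob is

$$|\Psi_1\rangle^{ABRP} = \sum_{\alpha,\beta} \Pi_\alpha \Pi_\beta |\Psi_0\rangle^{ABR}|\alpha,\beta\rangle^P. \qquad (4)$$

The system label $P$, for "public", is shorthand for having arbitrarily
many copies $P_1, P_2, \ldots$ of the values $\alpha$, $\beta$, and mimics
the information being classically transmitted. Given $\beta$, Bob can
coherently perform the measurement $\Lambda_{\beta,\mathbf{k}}$ to extract
the value of $\mathbf{k}$ in $A$ to an auxiliary system $C$ with high
probability. One can show that this implies the state is very nearly
identical to

$$|\Psi_2\rangle = \sum_{\alpha,\beta,\mathbf{k}} \Pi_\alpha \Pi_\beta \sqrt{p_{\mathbf{k}}}\,|\mathbf{k}\mathbf{k}\rangle^{AC}|\varphi_{\mathbf{k}}\rangle^{BR}|\alpha,\beta\rangle^P = \sum_{\alpha,\beta} \Pi_\alpha \Pi_\beta |\Psi_C\rangle|\alpha,\beta\rangle^P, \qquad (5)$$

with $|\Psi_C\rangle = \sum_{\mathbf{k}} \sqrt{p_{\mathbf{k}}}\,|\mathbf{k}\mathbf{k}\rangle^{AC}|\varphi_{\mathbf{k}}\rangle^{BR}$
the state in which Bob holds a Z-basis copy of Alice's system.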

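Next, Bob can coherently measure $\Lambda_{\alpha,\mathbf{x}}$ to extract
$\mathbf{x}$ in the conjugate basis of $A$ to a further auxiliary system
$D$, again with high probability. The resulting state is nearly identical
to

$$|\Psi_3\rangle = \frac{1}{\sqrt{2^n}} \sum_{\alpha,\beta,\mathbf{x}} \Pi_\alpha \Pi_\beta |\tilde{\mathbf{x}}\rangle^A|\tilde{\mathbf{x}}\rangle^D|\vartheta_{\mathbf{x}}\rangle^{CBR}|\alpha,\beta\rangle^P. \qquad (6)$$

Owing to the properties of $X$ and $Z$ and the two forms of the state with
Bob's Z-basis copy, we have the relation
$|\vartheta_{\mathbf{x}}\rangle^{CBR} = \sum_{\mathbf{k}} \sqrt{p_{\mathbf{k}}}\,\omega^{\mathbf{x}\cdot\mathbf{k}}|\mathbf{k}\rangle^C|\varphi_{\mathbf{k}}\rangle^{BR} = (Z^{\mathbf{x}})^C|\vartheta_{\mathbf{0}}\rangle^{CBR}$,
with $\omega = -1$ for qubits. Inserting this into equation 6 gives

$$|\Psi_3\rangle = \frac{1}{\sqrt{2^n}} \sum_{\alpha,\beta,\mathbf{x}} \Pi_\alpha \Pi_\beta |\tilde{\mathbf{x}}\rangle^A|\tilde{\mathbf{x}}\rangle^D (Z^{\mathbf{x}})^C|\Psi_0\rangle^{CBR}|\alpha,\beta\rangle^P. \qquad (7)$$

Finally, a controlled-Z operation from $D$ to $C$ inverts the
$Z^{\mathbf{x}}$ operator, leaving the desired output

$$|\Psi_4\rangle = \sum_{\alpha,\beta} \Pi_\alpha \Pi_\beta |\Phi_n\rangle^{AD}|\alpha,\beta\rangle^P \otimes |\Psi_0\rangle^{CBR}, \qquad (8)$$

where $|\Phi_n\rangle = |\Phi\rangle^{\otimes n}$. Observe that the
purification of $R$ is now solely in Bob's possession, so state merging has
been accomplished. Furthermore, since $n[S(Z^A|B) + S(X^A|CB)]$ CSS
stabilizers leave $n[1 - S(Z^A|B) - S(X^A|CB)]$ encoded logical operators,
Alice and Bob share this many entangled pairs in systems $A$ and $D$. In
[4] it is shown that this equals $-nS(A|B)$, so provided this quantity is
positive ($S(A|B) < 0$), the protocol achieves the rate $R_E$.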
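
The Fourier relation used between equations 6 and 7 can be checked directly in the simplest case. The following sketch assumes a single qubit and drops the systems $B$ and $R$ (so the role of $|\vartheta_0\rangle$ is played by $\sum_k \sqrt{p_k}|k\rangle^C$); it verifies that $\sum_k \sqrt{p_k}|kk\rangle^{AC} = \frac{1}{\sqrt{2}}\sum_x |\tilde{x}\rangle^A (Z^x)^C|\vartheta_0\rangle^C$. The function names are illustrative.

```python
import math

def x_eigenstate(x):
    """|x~> = (|0> + (-1)^x |1>)/sqrt(2), an eigenstate of X."""
    s = 1 / math.sqrt(2)
    return [s, s * (-1) ** x]

def copy_state(p0):
    """sum_k sqrt(p_k)|k>^A|k>^C, with amplitudes indexed as 2*a + c."""
    return [math.sqrt(p0), 0.0, 0.0, math.sqrt(1 - p0)]

def conjugate_expansion(p0):
    """(1/sqrt 2) sum_x |x~>^A (Z^x)^C |theta_0>^C in the same indexing."""
    theta0 = [math.sqrt(p0), math.sqrt(1 - p0)]
    total = [0.0] * 4
    for x in (0, 1):
        theta_x = [theta0[0], (-1) ** x * theta0[1]]   # Z^x |theta_0>
        xa = x_eigenstate(x)
        for a in range(2):
            for c in range(2):
                total[2 * a + c] += xa[a] * theta_x[c] / math.sqrt(2)
    return total

def max_deviation(p0):
    """Largest amplitude difference between the two forms of the state."""
    return max(abs(u - v)
               for u, v in zip(copy_state(p0), conjugate_expansion(p0)))
```

Since $Z^x Z^x = 1$, applying $Z^x$ to $C$ conditioned on the value $x$ stored in $D$ removes the phase again, which is the role of the controlled-Z step above.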

Of course, $|\Psi_4\rangle$ is not precisely the output of the protocol,
since the two coherent measurement operations by Bob were not perfect. The
details of the approximation are given in [4], the result being that if
Alice chooses a random code having $n[S(Z^A|B) + \delta]$ Z-type
stabilizers and $n[S(X^A|CB) + \delta]$ X-type stabilizers for some
$\delta > 0$, then the output will be within $\exp(-O(n\delta^2))$ of
$|\Psi_4\rangle$, as measured by the trace distance.

If $S(A|B) > 0$, we can use the same trick as [1, 2]. Adding
$n[S(A|B) + 2\delta]$ entangled pairs, each of which has $S(A|B) = -1$,
the conditional entropy of the overall state
$|\Psi_0\rangle^{ABR}|\Phi_{n[S(A|B)+2\delta]}\rangle^{A'B'}$ is
$nS(A|B) - n[S(A|B) + 2\delta] = -2n\delta$. Using this as the individual
input into the above protocol accomplishes the state merging and outputs no
entanglement. In this way $R_E$ can be achieved when $S(A|B) > 0$.

The above protocol requires too much classical communication, however:
Alice sends $n[S(Z^A|B) + S(X^A|CB)] = n[1 + S(A|B)]$ bits. This is
generally greater than $nI(A{:}R) = n[S(A) + S(A|B)]$, and is only equal
for $S(A) = 1$. The fact that the protocol is optimal when $\psi^A$ is
maximally mixed suggests that for a general input Alice should first
compress her system and then run the protocol. However, the compression
procedure will disturb the conjugate observable $X$ and its eigenbasis, so
there is no longer any guarantee that Bob's $\Lambda_{\alpha,\mathbf{x}}$
measurement will work as intended. The next section shows how to fix this
problem.
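
The size of the gap is easy to quantify. The sketch below relies on the counts stated earlier: the stabilizers number $n[S(Z^A|B) + S(X^A|CB)]$ and leave $-nS(A|B)$ pairs out of $n$ qubits, so the syndrome message has $n[1 + S(A|B)]$ bits, while the optimal rate is $I(A{:}R) = S(A) + S(A|B)$ for a pure state on $ABR$. The numerical values are made up for illustration.

```python
def ed_communication(n, s_cond):
    """Syndrome bits sent in the distillation protocol: n[1 + S(A|B)]."""
    return n * (1 + s_cond)

def optimal_communication(n, s_a, s_cond):
    """Optimal classical communication: n I(A:R) = n[S(A) + S(A|B)]."""
    return n * (s_a + s_cond)

n, s_a, s_cond = 1000, 0.6, -0.2            # hypothetical example values
ed_bits = ed_communication(n, s_cond)       # about 800 bits
opt_bits = optimal_communication(n, s_a, s_cond)   # about 400 bits
excess = ed_bits - opt_bits                 # n[1 - S(A)], zero iff S(A) = 1
```

The excess $n[1 - S(A)]$ is exactly the redundancy that compressing Alice's system to roughly $nS(A)$ qubits is meant to remove.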

4 Classical Communication Reduced 

Fortunately, the ensemble of states $\vartheta_{\mathbf{x}}^{CB}$ which
Bob would like to distinguish is invariant under the action of the group
$\{(Z^{\mathbf{x}})^C\}$, which will enable us to adapt the original
$\Lambda_{\alpha,\mathbf{x}}$ measurement for use after Alice compresses
her state. This will reduce the number of X syndromes she needs to
communicate to Bob to the optimal level.

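The modified protocol begins as before with the state $|\Psi_0\rangle$.
Alice then makes a measurement projecting her system onto the typical
subspace $\mathcal{T}_\delta^n$, which is the subspace spanned by
eigenvectors $|\mathbf{k}\rangle$ whose $\mathbf{k}$ are in the typical set
$T_\delta^n = \{\mathbf{k} : |-\frac{1}{n}\log p_{\mathbf{k}} - S(\psi^A)| < \delta\}$
for a fixed $\delta > 0$ [11, 12]. The probability
$M_\delta^n = \Pr[\mathbf{k} \in T_\delta^n]$ that $\mathbf{k}$ is typical
is greater than $1 - 2^{-cn\delta^2} := 1 - \epsilon$, for some constant
$c$ [5], and therefore the projection succeeds with probability
exponentially close to unity; otherwise the protocol aborts. When it
succeeds, it prunes the state $|\Psi_0\rangle$, leaving

$$|\Psi'_0\rangle^{ABR} = \frac{1}{\sqrt{M_\delta^n}} \sum_{\mathbf{k} \in T_\delta^n} \sqrt{p_{\mathbf{k}}}\,|\mathbf{k}\rangle^A|\varphi_{\mathbf{k}}\rangle^{BR} = \sum_{\mathbf{k} \in T_\delta^n} \sqrt{p'_{\mathbf{k}}}\,|\mathbf{k}\rangle^A|\varphi_{\mathbf{k}}\rangle^{BR}, \qquad (9)$$

where we have implicitly defined new probability weights
$p'_{\mathbf{k}} = p_{\mathbf{k}}/M_\delta^n$. Importantly,
$D_\delta := \dim(\mathcal{T}_\delta^n) \le 2^{n(S(\psi^A)+\delta)}$, and
a simple calculation shows that
$\langle\Psi'_0|\Psi_0\rangle = \sqrt{M_\delta^n}$. This implies that the
two states are close in trace distance,
$\|\Psi_0 - \Psi'_0\|_1 \le \sqrt{\epsilon}$, using the relationship
between fidelity and trace distance
$\|\rho - \sigma\|_1 \le \sqrt{1 - F(\rho,\sigma)^2}$ [13].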
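
For small $n$ these quantities can be computed by brute force. The sketch below assumes a qubit with spectrum $(p_0, 1 - p_0)$ and enumerates all strings $\mathbf{k}$; it returns the single-copy entropy, the typical mass $M_\delta^n$, the typical-set size $D_\delta$ and the overlap $\langle\Psi'_0|\Psi_0\rangle$. The function name and the parameter values are illustrative, and at such small $n$ the mass is still far from one.

```python
import itertools
import math

def typical_set_stats(p0, n, delta):
    """Enumerate k in {0,1}^n and collect the statistics of the typical set."""
    probs = (p0, 1.0 - p0)
    entropy = -sum(p * math.log2(p) for p in probs)
    typical = []
    for k in itertools.product((0, 1), repeat=n):
        pk = math.prod(probs[b] for b in k)
        if abs(-math.log2(pk) / n - entropy) < delta:
            typical.append(pk)
    mass = sum(typical)                      # M = Pr[k typical]
    # <Psi'_0|Psi_0> = sum over typical k of sqrt(p_k) * sqrt(p_k / M)
    overlap = sum(math.sqrt(pk) * math.sqrt(pk / mass) for pk in typical)
    return entropy, mass, len(typical), overlap

entropy, mass, size, overlap = typical_set_stats(0.8, 12, 0.2)
bound = 2 ** (12 * (entropy + 0.2))          # 2^{n(S + delta)}
```

For these values the typical strings are those with two or three ones, `size` stays below `bound`, and `overlap` agrees with `math.sqrt(mass)` up to rounding.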

The protocol proceeds just as before, measuring X'- and Z'-type
stabilizers of a random CSS code on the pruned state and communicating the
results to Bob. Here $Z'$ is the analog of $Z$ for the typical subspace,
and $X'$ is its Fourier conjugate. Now, however, we have no direct way of
setting the number of stabilizers, since the state is no longer i.i.d. and
therefore the HSW theorem no longer applies. This is not really a problem
for the Z'-type stabilizers, since the typical projection is done in the
$|\mathbf{k}\rangle$ basis, the basis which generates the
$\varphi_{\mathbf{k}}^B$. By design, the measurement constructed in the HSW
theorem does not attempt to identify $\varphi_{\mathbf{k}}^B$ for
nontypical $\mathbf{k}$, so Bob can just reuse it in this case. The
probability of error will only decrease by explicitly rejecting nontypical
$\mathbf{k}$. Hence $m_z \approx nS(Z^A|B)$ as before.

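However, the original measurement will not work for the conjugate
basis $|\tilde{\mathbf{x}}'\rangle$, the Fourier transform of the typical
subspace basis, since the states $\vartheta'^{CB}_{\mathbf{x}'}$ have no a
priori relation to the original $\vartheta^{CB}_{\mathbf{x}}$. However, the
former states stem from the related state

$$|\Psi'_C\rangle = \sum_{\mathbf{k} \in T_\delta^n} \sqrt{p'_{\mathbf{k}}}\,|\mathbf{k}\mathbf{k}\rangle^{AC}|\varphi_{\mathbf{k}}\rangle^{BR} = \frac{1}{\sqrt{D_\delta}} \sum_{\mathbf{x}'} |\tilde{\mathbf{x}}'\rangle^A|\vartheta'_{\mathbf{x}'}\rangle^{CBR}, \qquad (10)$$

and this fact, coupled with the group covariance of both sets, gives us a
means to transform $\Lambda^{CB}_{\alpha,\mathbf{x}}$ into a measurement
$\Lambda'^{CB}_{\alpha,\mathbf{x}'}$ suitable for distinguishing the
$\vartheta'^{CB}_{\mathbf{x}'}$.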

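To see how this works, it is easiest to go back to the proof of the
HSW theorem, which for convenience is stated in the appendix. In the
original i.i.d. case, the projectors $P^{CB}_{\mathbf{x}}$ and $P^{CB}$
onto the typical subspaces of $\vartheta^{CB}_{\mathbf{x}}$ and
$\bar{\vartheta}^{CB} = \frac{1}{d}\sum_{\mathbf{x}} \vartheta^{CB}_{\mathbf{x}}$,
respectively, fulfill the five conditions needed in the proof of the
theorem, equations 17 through 21. Since
$\vartheta^{CB}_{\mathbf{x}} = (Z^{\mathbf{x}})^C \vartheta^{CB}_{\mathbf{0}} (Z^{-\mathbf{x}})^C$,
the same holds for $P^{CB}_{\mathbf{x}}$, and the five conditions become

$$\mathrm{Tr}[\vartheta^{CB}_{\mathbf{0}}(1^{CB} - P^{CB}_{\mathbf{0}})] \le \epsilon \qquad (11)$$
$$\mathrm{Tr}[\bar{\vartheta}^{CB}(1^{CB} - P^{CB})] \le \epsilon \qquad (12)$$
$$\mathrm{Tr}[P^{CB}_{\mathbf{0}}] \le r \qquad (13)$$
$$\sum_{\mathbf{x}} 1 \le d \qquad (14)$$
$$\|P^{CB}\bar{\vartheta}^{CB}P^{CB}\|_\infty \le \lambda \qquad (15)$$

with $\epsilon = 2^{-cn\delta^2}$, $r = 2^{n[S(\psi^{AB})+\delta]}$,
$d = 2^n$ (and the condition is an equality since all $\mathbf{x}$ are
typical), and $\lambda = 2^{-n[S(\bar{\vartheta}^{CB})-\delta]}$. Our aim
is now to find a set of new projectors $P'^{CB}_{\mathbf{x}'}$ and
$P'^{CB}$ fulfilling these conditions for the states
$\vartheta'^{CB}_{\mathbf{x}'}$ and
$\bar{\vartheta}'^{CB} = \frac{1}{D_\delta}\sum_{\mathbf{x}'} \vartheta'^{CB}_{\mathbf{x}'}$.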

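To start, use the fact that
$\mathrm{Tr}[(\vartheta'^{CB}_{\mathbf{0}} - \vartheta^{CB}_{\mathbf{0}})P] \le \|\vartheta'^{CB}_{\mathbf{0}} - \vartheta^{CB}_{\mathbf{0}}\|_1 \le \sqrt{\epsilon}$
for any projector $P$, since the trace distance is equal to the maximum of
the lefthand side, maximized over all projectors [8]. Then we have

$$\mathrm{Tr}[(1 - P^{CB}_{\mathbf{0}})\vartheta'^{CB}_{\mathbf{0}}] \le \mathrm{Tr}[(1 - P^{CB}_{\mathbf{0}})\vartheta^{CB}_{\mathbf{0}}] + \|\vartheta'^{CB}_{\mathbf{0}} - \vartheta^{CB}_{\mathbf{0}}\|_1 \le \epsilon + \sqrt{\epsilon},$$

and so we can define
$P'^{CB}_{\mathbf{x}'} = (Z'^{\mathbf{x}'})^C P^{CB}_{\mathbf{0}} (Z'^{-\mathbf{x}'})^C$
to satisfy the first condition. The second condition follows analogously
upon noting that
$\bar{\vartheta}^{CB} = \sum_{\mathbf{k}} p_{\mathbf{k}}\,|\mathbf{k}\rangle\langle\mathbf{k}|^C \otimes \varphi^B_{\mathbf{k}}$
(and similarly for the pruned version) and therefore
$\|\bar{\vartheta}'^{CB} - \bar{\vartheta}^{CB}\|_1 \le 2(1 - M_\delta^n) \le 2\epsilon$.
The third condition remains as is, since we're using the same
$P_{\mathbf{0}}$, and the fourth is an equality when $d = D_\delta$. For
the fifth condition, observe that

$$\bar{\vartheta}'^{CB} = \frac{1}{M_\delta^n} \sum_{\mathbf{k} \in T_\delta^n} p_{\mathbf{k}}\,|\mathbf{k}\rangle\langle\mathbf{k}|^C \otimes \varphi^B_{\mathbf{k}} \le \frac{1}{M_\delta^n}\,\bar{\vartheta}^{CB}.$$

Therefore,
$P^{CB}\bar{\vartheta}'^{CB}P^{CB} \le \frac{1}{M_\delta^n} P^{CB}\bar{\vartheta}^{CB}P^{CB}$,
which leads immediately to
$\|P^{CB}\bar{\vartheta}'^{CB}P^{CB}\|_\infty \le \lambda/M_\delta^n \le \lambda(1 + 2\epsilon)$.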

We thus have all the ingredients needed to construct the required
measurement, with $\epsilon' = 2\sqrt{\epsilon}$, $r' = r$,
$d' = D_\delta$, and $\lambda' = \lambda(1 + 2\epsilon)$. The number of
syndromes Bob needs from Alice is given by
$m'_x \ge n[S(\psi^{AB}) + S(\psi^A) - S(\bar{\vartheta}^{CB}) + 3\delta] + \log(1 + 2\epsilon)$,
which works out to be
$\approx n[S(\psi^{AB}) - \sum_k p_k S(\varphi_k^B)]$. Since the pruned
state is nearly identical to the original state, the remainder of the
protocol goes through as before, outputting roughly $nS(A) - m_z - m'_x$
entangled pairs. A simple calculation (along the lines of lemma 2 in [4])
gives $m'_x + m_z = nI(A{:}R)$ and $nS(A) - m'_x - m_z = -nS(A|B)$, and
thus the protocol is optimal.
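
The final calculation can be spelled out as follows. This is a sketch under the assumptions that the $\delta$ and $\epsilon$ corrections are dropped, that $H(p_k) = S(\psi^A)$ because $|k\rangle$ is the eigenbasis of $\psi^A$, and that the state on $ABR$ is pure, so that $S(R) = S(AB)$ and $S(AR) = S(B)$; the single-copy expression for $S(Z^A|B)$ is taken from the static HSW rate in the appendix.

```latex
\begin{align}
m_z  &\approx n\,S(Z^A|B)
      = n\Big[H(p_k) + \sum_k p_k S(\varphi_k^B) - S(\psi^B)\Big], \\
m'_x &\approx n\Big[S(\psi^{AB}) - \sum_k p_k S(\varphi_k^B)\Big], \\
m'_x + m_z &\approx n\big[S(\psi^A) + S(\psi^{AB}) - S(\psi^B)\big]
      = n\big[S(A) + S(R) - S(AR)\big] = n\,I(A{:}R), \\
nS(A) - m'_x - m_z &\approx -n\big[S(\psi^{AB}) - S(\psi^B)\big] = -n\,S(A|B).
\end{align}
```

The terms $\sum_k p_k S(\varphi_k^B)$ cancel between the two syndrome counts, which is why only the entropies of $\psi^A$, $\psi^B$ and $\psi^{AB}$ survive in the rates.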

5 Conclusion 

We have shown how to construct an optimal state merging protocol by
following the intuition from quantum error-correction that what really
matters in two-party quantum information processing is information about
amplitude and phase measurements. Combining entanglement distillation
with teleportation, our results also imply a new proof of the direct part
of the noisy channel coding theorem [5], one not following the usual route
of decoupling Alice's system from the purification R (e.g. all the fully
fleshed-out proofs to date [14-18]). It would be interesting to apply these
techniques to more protocols, and see how far this intuition about
quantum information extends.
A Static HSW Theorem

Here we are interested in the "static" setting of the HSW theorem, which
is concerned with the following. Given $n$ samples from an ensemble
$\{p_k, \rho_k\}$ with average $\rho = \sum_k p_k \rho_k$, what is the
smallest amount of side information $t = f(\mathbf{k})$ required in order
to reliably construct a measurement $\Lambda_{t,\mathbf{k}}$ which will
identify $\mathbf{k}$ from $\rho_{\mathbf{k}}$ with only a small
probability of error? In order to match the setting in the main text, we
can think of the ensemble as arising from the state
$\bar{\psi}^{AB} = \sum_k p_k\,|k\rangle\langle k|^A \otimes \rho_k^B$, a
measurement of $|k\rangle$ (or $Z^A$) on $A$ generating state $\rho_k$. For
random CSS codes $f$ is a random linear function, resulting from measuring
the stabilizer observables on the state $|\mathbf{k}\rangle$. However, in
what follows we will consider universal hashing [19], since it is no more
difficult to do so. In universal (or 2-universal) hashing, the function
$f : \{0,1\}^n \to \{0,1\}^m$ generating the side information is chosen at
random from a universal family of hash functions in which the probability
of collision $f(x) = f(y)$ but $x \ne y$ is the same as for random
functions: $\Pr_f[f(x) = f(y) \,|\, x \ne y] \le 1/2^m$.
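
As a small sanity check of the collision bound, the sketch below takes the family of all linear maps $f(x) = Ax$ over GF(2), the kind of function that syndrome measurement produces, and computes the collision probability exactly by enumerating every $m \times n$ binary matrix. The sizes $n = 3$, $m = 2$ and the function names are illustrative choices.

```python
import itertools

def linear_hash(matrix, x):
    """f(x) = A x over GF(2); matrix is a tuple of m rows, x a tuple of n bits."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in matrix)

def collision_probability(x, y, n, m):
    """Exact Pr_A[f(x) = f(y)] over all m-by-n binary matrices A."""
    rows = list(itertools.product((0, 1), repeat=n))
    matrices = list(itertools.product(rows, repeat=m))
    hits = sum(linear_hash(A, x) == linear_hash(A, y) for A in matrices)
    return hits / len(matrices)

# For distinct inputs each row of A is orthogonal to x XOR y for exactly
# half of its values, so the probability is 2^-m = 0.25 here.
prob = collision_probability((1, 0, 0), (0, 1, 1), n=3, m=2)
```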

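In [4] we proved that for a fixed $\delta > 0$, choosing
$m = n[S(Z^A|B) + 4\delta]$ is sufficient to guarantee the existence of a
measurement having elements $\Lambda_{f(\mathbf{k}),\mathbf{k}}$ such that
the probability of error $P_e$ is exponentially small:

[Equation (16), the exponential bound on $P_e$, is not legible in the source.]

A crucial step in the proof is to show the existence of projectors
$Q_{\mathbf{k}}$ and $Q$ such that

$$\mathrm{Tr}[\langle\rho_{\mathbf{k}}\rangle_{\mathbf{k}}(1 - Q)] \le \epsilon \qquad (17)$$
$$\langle\mathrm{Tr}[\rho_{\mathbf{k}}(1 - Q_{\mathbf{k}})]\rangle_{\mathbf{k}} \le \epsilon \qquad (18)$$
$$\mathrm{Tr}[Q_{\mathbf{k}}] \le r \qquad (19)$$
$$\sum_{\mathbf{k} \in T} 1 \le d \qquad (20)$$
$$\|Q\langle\rho_{\mathbf{k}}\rangle_{\mathbf{k}}Q\|_\infty \le \lambda, \qquad (21)$$

where $\langle\cdot\rangle_{\mathbf{k}}$ denotes the average over
$\mathbf{k}$ and $T$ is the set of typical $\mathbf{k}$,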

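after which it can be shown that an $m$ of order $\log(r d \lambda)$, with
the precise threshold depending on a parameter $0 < \gamma < 1$, suffices
to construct the measurement. In the i.i.d. case of the HSW theorem, the
$Q_{\mathbf{k}}$ and $Q$ are projectors onto the typical subspaces of
$\rho_{\mathbf{k}}$ (for typical $\mathbf{k}$) and $\rho^{\otimes n}$,
respectively, for which $\epsilon = 2^{-cn\delta^2}$,
$r = 2^{n[\sum_k p_k S(\rho_k)+\delta]}$, $d = 2^{n[H(p_k)+\delta]}$, and
$\lambda = 2^{-n[S(\rho)-\delta]}$. Thus, one chooses
$m \ge n[H(p_k) - S(\rho) + \sum_k p_k S(\rho_k) + 4\delta] = n[S(Z^A|B) + 4\delta]$.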
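
To see the size of these exponents in a concrete case, the sketch below evaluates $\frac{1}{n}\log_2(r d \lambda)$ for an ensemble of commuting states, each given by its diagonal in a common basis (an assumption that keeps the example in plain Python, since the average state is then the averaged diagonal). The chosen ensemble, two equiprobable qubit states that are bit-flipped versions of each other with flip probability 0.1, is hypothetical.

```python
import math

def entropy(spectrum):
    """Shannon / von Neumann entropy in bits of a list of eigenvalues."""
    return -sum(p * math.log2(p) for p in spectrum if p > 0)

def syndrome_rate(probs, spectra, delta):
    """(1/n) log2(r d lambda) for commuting states given by their diagonals."""
    dim = len(spectra[0])
    average = [sum(p * s[i] for p, s in zip(probs, spectra)) for i in range(dim)]
    mean_entropy = sum(p * entropy(s) for p, s in zip(probs, spectra))
    log_r = mean_entropy + delta           # r = 2^{n[sum_k p_k S(rho_k) + delta]}
    log_d = entropy(probs) + delta         # d = 2^{n[H(p_k) + delta]}
    log_lam = -(entropy(average) - delta)  # lambda = 2^{-n[S(rho) - delta]}
    return log_r + log_d + log_lam

delta = 0.01
rate = syndrome_rate([0.5, 0.5], [(0.9, 0.1), (0.1, 0.9)], delta)
binary_entropy = entropy([0.9, 0.1])
```

In this example $H(p_k) = S(\rho) = 1$, so the rate reduces to the binary entropy of the flip probability plus $3\delta$, the familiar amount of side information needed to correct bit flips of that strength.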